GSData container¶
- class gsply.gsdata.GSData(means, scales, quats, opacities, sh0, shN, _format=<factory>, masks=None, mask_names=None, _base=None)[source]¶
Bases: object
Gaussian Splatting data container.
This container holds Gaussian parameters, either as separate arrays or as zero-copy views into a single base array for maximum performance. Implemented as a mutable dataclass with direct attribute access.
- Parameters:
- means¶
(N, 3) - xyz positions
- scales¶
(N, 3) - scale parameters
- PLY format: log-scales (log(scale))
- LINEAR format: linear scales (scale)
- quats¶
(N, 4) - rotation quaternions
- opacities¶
(N,) - opacity values
- PLY format: logit-opacities (logit(opacity))
- LINEAR format: linear opacities (opacity in [0, 1])
- sh0¶
(N, 3) - DC spherical harmonics (always SH format)
- shN¶
(N, K, 3) - Higher-order SH coefficients (K bands) (always SH format)
- masks¶
(N,) or (N, L) - Boolean mask layers for filtering (None = no masks)
- mask_names¶
list[str] - Names for each mask layer (None = unnamed layers)
- _base¶
(N, P) - Private base array (keeps memory alive for views, None otherwise)
- _format¶
FormatDict - Format tracking per attribute (type-safe TypedDict)
- Format: {"scales": DataFormat.SCALES_PLY, "opacities": DataFormat.OPACITIES_PLY, …}
- Scales: DataFormat.SCALES_PLY (log-scales) or DataFormat.SCALES_LINEAR (linear scales)
- Opacities: DataFormat.OPACITIES_PLY (logit-opacities) or DataFormat.OPACITIES_LINEAR (linear opacities)
- Colors: DataFormat.SH0_SH (sh0 as SH) or DataFormat.SH0_RGB (sh0 as RGB)
- SH Order: DataFormat.SH_ORDER_0/1/2/3 (spherical harmonics degree for shN)
- Positions/Rotations: DataFormat.MEANS_RAW (means) and DataFormat.QUATS_RAW (quats) - raw format
- Always provided when creating GSData (auto-detected if not specified)
- Mask Layers:
Single layer: masks shape (N,), mask_names = None or ["name"]
Multi-layer: masks shape (N, L), mask_names = ["name1", "name2", …]
Use add_mask_layer() to add named layers
Use combine_masks() to merge layers with AND/OR logic
Use apply_masks() to filter data using mask layers
- Performance:
Zero-copy reads provide maximum performance
No memory overhead (views share memory with base)
Example
>>> data = plyread("scene.ply")
>>> print(f"Loaded {len(data)} Gaussians")
>>> # Add named mask layers
>>> data.add_mask_layer("high_opacity", data.opacities > 0.5)
>>> data.add_mask_layer("foreground", data.means[:, 2] < 0)
>>> # Combine and apply
>>> filtered = data.apply_masks(mode="and")
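The zero-copy layout described above can be illustrated with plain NumPy. This is only a sketch: the 14-column layout and the column order are assumptions made for illustration, and the real _base layout may differ.

```python
import numpy as np

# Hypothetical layout (an assumption, not taken from gsply): one float32 row
# per Gaussian, with columns means(3) | sh0(3) | opacity(1) | scales(3) | quats(4).
N = 5
base = np.arange(N * 14, dtype=np.float32).reshape(N, 14)

means = base[:, 0:3]      # (N, 3) view into base
sh0 = base[:, 3:6]        # (N, 3) view
opacities = base[:, 6]    # (N,) view
scales = base[:, 7:10]    # (N, 3) view
quats = base[:, 10:14]    # (N, 4) view

# Views share memory with the base array, so slicing copies no data.
shares = np.shares_memory(means, base)

# A write through a view is visible in the base array.
means[0, 0] = -1.0
written_through = base[0, 0] == -1.0

# Column views of a row-major array are not C-contiguous.
means_contiguous = means.flags["C_CONTIGUOUS"]
```

The last flag is the reason make_contiguous() exists: the views are free to create but are strided in memory.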
- property is_scales_ply: bool¶
Check if scales are in PLY format (log-scales).
- Returns:
True if scales are log-scales
- property is_scales_linear: bool¶
Check if scales are in linear format.
- Returns:
True if scales are linear
- property is_opacities_ply: bool¶
Check if opacities are in PLY format (logit-opacities).
- Returns:
True if opacities are logit-opacities
- property is_opacities_linear: bool¶
Check if opacities are in linear format [0, 1].
- Returns:
True if opacities are linear
- property is_sh0_sh: bool¶
Check if sh0 is in spherical harmonics format.
- Returns:
True if sh0 is in SH format
- property is_sh0_rgb: bool¶
Check if sh0 is in RGB color format.
- Returns:
True if sh0 is in RGB format
- property is_sh_order_0: bool¶
Check if SH degree is 0 (only sh0, no shN).
- Returns:
True if SH degree is 0
- property format_state: FormatDict¶
Get a read-only copy of the format state.
Returns a copy of the internal format dict for inspection. Use copy_format_from() to copy format between objects.
- Returns:
Copy of the format dict (modifications won’t affect original)
Example
>>> data = gsply.plyread("scene.ply")
>>> fmt = data.format_state
>>> print(fmt)  # {'scales': DataFormat.SCALES_PLY, ...}
- copy_format_from(other)[source]¶
Copy format tracking from another GSData object.
This is the public API for copying format state between objects. Use this instead of directly accessing _format dict.
Example
>>> # After processing that might lose format
>>> processed.copy_format_from(original)
- with_format(**updates)[source]¶
Create a copy with updated format settings.
Returns a new GSData with the same data but updated format dict. This is useful for explicitly setting format after operations.
- Parameters:
updates – Format updates (keys: scales, opacities, sh0, sh_order)
- Return type:
- Returns:
New GSData with updated format
Example
>>> # Mark data as having linear opacities after conversion
>>> linear_data = data.with_format(opacities=DataFormat.OPACITIES_LINEAR)
- add_mask_layer(name, mask)[source]¶
Add a named boolean mask layer.
- Parameters:
- Raises:
ValueError – If mask shape doesn’t match data length or name already exists
- Return type:
Example
>>> data.add_mask_layer("high_opacity", data.opacities > 0.5)
>>> data.add_mask_layer("foreground", data.means[:, 2] < 0)
>>> print(data.mask_names)  # ['high_opacity', 'foreground']
- get_mask_layer(name)[source]¶
Get a mask layer by name.
- Parameters:
name (str) – Name of the mask layer
- Return type:
- Returns:
Boolean array of shape (N,)
- Raises:
ValueError – If layer name not found
Example
>>> opacity_mask = data.get_mask_layer("high_opacity")
- remove_mask_layer(name)[source]¶
Remove a mask layer by name.
- Parameters:
name (str) – Name of the mask layer to remove
- Raises:
ValueError – If layer name not found
- Return type:
Example
>>> data.remove_mask_layer("foreground")
- combine_masks(mode='and', layers=None)[source]¶
Combine mask layers using boolean logic.
- Parameters:
- Return type:
- Returns:
Combined boolean mask of shape (N,)
- Raises:
ValueError – If no masks exist or invalid mode
Example
>>> # Combine all layers with AND
>>> mask = data.combine_masks(mode="and")
>>> filtered = data[mask]
>>>
>>> # Combine specific layers with OR
>>> mask = data.combine_masks(mode="or", layers=["opacity", "foreground"])
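As a rough illustration of the AND/OR logic, the reduction over an (N, L) mask array can be written in NumPy as below. The helper combine_layers is hypothetical and is not part of gsply; it only mirrors the documented behaviour of returning a single (N,) mask.

```python
import numpy as np

def combine_layers(masks, mode="and"):
    """Reduce an (N,) or (N, L) boolean array to a single (N,) mask."""
    masks = np.asarray(masks, dtype=bool)
    if masks.ndim == 1:
        return masks                  # single layer: nothing to combine
    if mode == "and":
        return masks.all(axis=1)      # keep rows where every layer is True
    if mode == "or":
        return masks.any(axis=1)      # keep rows where any layer is True
    raise ValueError(f"invalid mode: {mode!r}")

opacities = np.array([0.9, 0.2, 0.7, 0.6])
depth = np.array([-1.0, -2.0, 3.0, -0.5])

# Two layers stacked column-wise, giving shape (4, 2).
layers = np.column_stack([opacities > 0.5, depth < 0])

and_mask = combine_layers(layers, "and")
or_mask = combine_layers(layers, "or")
```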
- apply_masks(mode='and', layers=None, inplace=False)[source]¶
Apply mask layers to filter Gaussians.
- Parameters:
- Return type:
- Returns:
Filtered GSData (self if inplace=True, new object if inplace=False)
Example
>>> # Filter using all mask layers (AND logic)
>>> filtered = data.apply_masks(mode="and")
>>>
>>> # Filter in-place using specific layers (OR logic)
>>> data.apply_masks(mode="or", layers=["opacity", "scale"], inplace=True)
- consolidate()[source]¶
Consolidate separate arrays into a single base array.
This creates a _base array from separate arrays, which can improve performance for boolean masking operations and file writes.
Uses JIT-compiled parallel kernels for 2.8-5x faster interleaving compared to slice assignment.
- Return type:
- Returns:
New GSData with _base array, or self if already consolidated
Note
One-time cost: ~3ms per 400K Gaussians (JIT-optimized)
Benefit: 1.5x faster boolean masking, 36% faster writes
No benefit for slicing (actually slightly slower)
Use when doing many boolean mask operations or file writes
- copy()[source]¶
Return a deep copy of the GSData.
Creates independent copies of all arrays, ensuring modifications to the copy won’t affect the original data.
- Return type:
- Returns:
A new GSData object with copied arrays
- add(other)[source]¶
Concatenate two GSData objects along the Gaussian dimension.
Combines two GSData objects by stacking all Gaussians. Validates compatibility (same SH degree) and handles mask layer merging.
Performance: Highly optimized using pre-allocation + direct assignment
- 1.10x faster for 10K Gaussians (412 M/s)
- 1.56x faster for 100K Gaussians (106 M/s)
- 1.90x faster for 500K Gaussians (99 M/s)
For GPU operations, use GSTensor.add() which is 18x faster on large datasets.
Note: For concatenating multiple arrays, use GSData.concatenate() which is 5.74x faster than repeated add() calls due to single allocation.
- Parameters:
other (GSData) – Another GSData object to concatenate
- Return type:
- Returns:
New GSData object with combined Gaussians
- Raises:
ValueError – If SH degrees don’t match or formats don’t match
Example
>>> data1 = gsply.plyread("scene1.ply")  # 100K Gaussians
>>> data2 = gsply.plyread("scene2.ply")  # 50K Gaussians
>>> combined = data1.add(data2)  # 150K Gaussians
>>> # Or use + operator
>>> combined = data1 + data2  # Same result
>>> print(len(combined))  # 150000
See also
concatenate: Bulk concatenation of multiple arrays (5.74x faster)
- static concatenate(arrays)[source]¶
Bulk concatenate multiple GSData objects.
Significantly more efficient than repeated add() calls:
- Single allocation instead of N-1 intermediate allocations
- 5.74x faster for concatenating 10 arrays
- Reduces total memory copies
- Parameters:
arrays (list[GSData]) – List of GSData objects to concatenate
- Return type:
- Returns:
New GSData object with all Gaussians combined
- Raises:
ValueError – If list is empty, SH degrees don’t match, or formats don’t match
Example
>>> scenes = [gsply.plyread(f"scene{i}.ply") for i in range(10)]
>>> combined = GSData.concatenate(scenes)  # 5.74x faster than loop!
- Performance Comparison (10 arrays of 10K Gaussians):
>>> # Slow: Pairwise add() - 5.990 ms
>>> result = scenes[0]
>>> for scene in scenes[1:]:
...     result = result.add(scene)
>>>
>>> # Fast: Bulk concatenate - 1.044 ms (5.74x faster!)
>>> result = GSData.concatenate(scenes)
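The difference between the two strategies can be sketched on a single (n, 3) field with NumPy. Both helpers below are hypothetical illustrations, not gsply functions: the bulk version allocates the output once and fills it by slice assignment, while the pairwise version re-allocates and re-copies the growing result at every step.

```python
import numpy as np

def concat_preallocated(chunks):
    """Concatenate (n_i, 3) arrays with one allocation and direct slice assignment."""
    if not chunks:
        raise ValueError("need at least one array")
    total = sum(len(c) for c in chunks)
    out = np.empty((total, 3), dtype=chunks[0].dtype)
    offset = 0
    for c in chunks:
        out[offset:offset + len(c)] = c   # copy each chunk exactly once
        offset += len(c)
    return out

def concat_pairwise(chunks):
    """Repeated two-way concatenation: N-1 intermediate arrays are allocated."""
    result = chunks[0]
    for c in chunks[1:]:
        result = np.concatenate([result, c])
    return result

rng = np.random.default_rng(0)
chunks = [rng.random((10, 3), dtype=np.float32) for _ in range(10)]

bulk = concat_preallocated(chunks)
pairwise = concat_pairwise(chunks)
```

Both paths produce the same array; only the number of allocations and copies differs.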
- make_contiguous(inplace=True)[source]¶
Convert all arrays to contiguous memory layout for better performance.
When data is loaded from PLY files via _base arrays, all field arrays (means, scales, etc.) are non-contiguous views with poor cache locality, causing 1.5-45x performance overhead for operations.
Conversion Cost (measured):
- 1K Gaussians: 0.02 ms
- 10K Gaussians: 0.14 ms
- 100K Gaussians: 2.2 ms
- 1M Gaussians: 25 ms
Per-Operation Speedup (100K Gaussians):
- argmax(): 45.5x faster
- max/min(): 18-19x faster
- sum/mean(): 6-7x faster
- std(): 2.7x faster
- element-wise: 2-4x faster
Break-Even Analysis:
- < 8 operations: DON’T convert (overhead not justified)
- >= 8 operations: CONVERT (speedup outweighs cost)
- >= 100 operations: CRITICAL (7.9x total speedup)
Real-World Scenarios (100K Gaussians):
- Light processing (3 ops): 2.4x slower (DON’T convert)
- Iterative processing (10x): 2.1x faster (CONVERT!)
- Heavy computation (100x): 7.9x faster (CONVERT!)
Memory: Zero overhead (same total memory, just reorganized)
- Parameters:
inplace (bool) – If True, modify arrays in-place and clear _base (default). If False, return new GSData with contiguous arrays.
- Return type:
- Returns:
Self if inplace=True, new GSData if inplace=False
Example
>>> data = gsply.plyread("scene.ply")  # Non-contiguous from _base
>>>
>>> # For few operations (< 8) - don't convert
>>> total = data.means.sum()  # Just use as-is
>>>
>>> # For many operations (>= 8) - convert first!
>>> data.make_contiguous()  # Up to 45x faster per operation
>>> for i in range(100):
...     result = data.means.sum() + data.means.max()  # 7.9x faster!
See also
is_contiguous: Check if arrays are already contiguous
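What "contiguous" means here can be checked directly in NumPy. The sketch below assumes a hypothetical 14-column base array; it shows that a column view is strided, and that np.ascontiguousarray produces an independent, densely packed copy with the same values.

```python
import numpy as np

# Hypothetical base array: 14 float32 columns per Gaussian (an assumption).
base = np.zeros((100_000, 14), dtype=np.float32)
base[:, 0:3] = 1.5

means_view = base[:, 0:3]                  # strided view: rows are 56 bytes apart
means = np.ascontiguousarray(means_view)   # packed copy: rows are 12 bytes apart

view_is_contiguous = means_view.flags["C_CONTIGUOUS"]
copy_is_contiguous = means.flags["C_CONTIGUOUS"]
still_shares_memory = np.shares_memory(means, base)
same_values = np.array_equal(means, means_view)
```

After the copy, reductions such as sum() or argmax() read a dense block instead of skipping over the other columns, which is where the per-operation speedups listed above come from.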
- is_contiguous()[source]¶
Check if all arrays are C-contiguous.
- Return type:
- Returns:
True if all arrays are contiguous, False otherwise
Example
>>> data = gsply.plyread("scene.ply")
>>> print(data.is_contiguous())  # False (from _base)
>>> data.make_contiguous()
>>> print(data.is_contiguous())  # True
- unpack(include_shN=True)[source]¶
Unpack Gaussian data into tuple of arrays.
Convenient for standard Gaussian Splatting workflows that expect individual arrays rather than a container object.
- Parameters:
include_shN (bool) – If True, include shN in output (default True)
- Return type:
- Returns:
If include_shN=True: (means, scales, quats, opacities, sh0, shN), If include_shN=False: (means, scales, quats, opacities, sh0)
Example
>>> data = plyread("scene.ply")
>>> means, scales, quats, opacities, sh0, shN = data.unpack()
>>> # Use with rendering functions
>>> render(means, scales, quats, opacities, sh0)
>>>
>>> # For SH0 data, exclude shN
>>> means, scales, quats, opacities, sh0 = data.unpack(include_shN=False)
- to_dict()[source]¶
Convert Gaussian data to dictionary.
- Return type:
- Returns:
Dictionary with keys: means, scales, quats, opacities, sh0, shN
Example
>>> data = plyread("scene.ply")
>>> props = data.to_dict()
>>> # Access by key
>>> positions = props['means']
>>> # Unpack dict values
>>> render(**props)
- normalize(inplace=True)[source]¶
Convert linear scales/opacities to PLY format (log-scales, logit-opacities).
Converts:
- Linear scales → log-scales: log(scale) with clamping
- Linear opacities → logit-opacities: logit(opacity) with clamping
This is the standard format used in Gaussian Splatting PLY files. Use this when you have linear data and need to save to PLY format.
- Parameters:
inplace (bool) – If True, modify this object in-place (default). If False, return new object.
- Return type:
- Returns:
GSData object (self if inplace=True, new object otherwise)
Example
>>> # Data with linear scales and opacities
>>> data = GSData(scales=[0.1, 0.2, 0.3], opacities=[0.5, 0.7, 0.9], ...)
>>> # Convert to PLY format in-place (modifies data)
>>> data.normalize()  # or: data.normalize(inplace=True)
>>> # Now ready to save with plywrite()
>>> plywrite("output.ply", data)
>>>
>>> # Or create a copy if you need to keep original
>>> ply_data = data.normalize(inplace=False)
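The two conversions can be sketched in NumPy as follows. The clamping bounds MIN_SCALE and EPS are assumptions chosen for illustration, since the values gsply actually uses are not stated here; clamping matters because log(0) and logit(0) or logit(1) are infinite.

```python
import numpy as np

# Hypothetical clamping bounds (assumptions, not gsply's documented values).
MIN_SCALE = 1e-9
EPS = 1e-6

def to_log_scales(scales):
    """Linear scales -> log-scales, clamping away from zero first."""
    return np.log(np.clip(scales, MIN_SCALE, None))

def to_logit_opacities(opacities):
    """Linear opacities in [0, 1] -> logits, clamping away from 0 and 1 first."""
    p = np.clip(opacities, EPS, 1.0 - EPS)
    return np.log(p / (1.0 - p))

scales = np.array([[0.1, 0.2, 0.3]])
opacities = np.array([0.0, 0.5, 1.0])   # includes both problematic endpoints

log_scales = to_log_scales(scales)
logits = to_logit_opacities(opacities)
```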
- denormalize(inplace=True)[source]¶
Convert PLY format (log-scales, logit-opacities) to linear format.
Converts:
- Log-scales → linear scales: exp(log_scale) with clamping
- Logit-opacities → linear opacities: sigmoid(logit)
- Quaternions → normalized quaternions
Use this when you load PLY files (which use log/logit format) and need linear values for computations or visualization.
- Parameters:
inplace (bool) – If True, modify this object in-place (default). If False, return new object.
- Return type:
- Returns:
GSData object (self if inplace=True, new object otherwise)
Example
>>> # Load PLY file (contains log-scales and logit-opacities)
>>> data = plyread("scene.ply")
>>> # Convert to linear format in-place (modifies data)
>>> data.denormalize()  # or: data.denormalize(inplace=True)
>>> # Now scales are linear and opacities are in [0, 1]
>>> print(f"Linear opacity range: [{data.opacities.min():.3f}, {data.opacities.max():.3f}]")
>>>
>>> # Or create a copy if you need to keep PLY format
>>> linear_data = data.denormalize(inplace=False)
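The three conversions listed above correspond to the NumPy operations below. This is a simplified sketch: it omits the clamping of exp() mentioned above, because the bound gsply applies is not stated here.

```python
import numpy as np

def sigmoid(x):
    """Inverse of logit: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

log_scales = np.log(np.array([[0.1, 0.2, 0.3]]))
logits = np.array([-2.0, 0.0, 2.0])
quats = np.array([[2.0, 0.0, 0.0, 0.0],
                  [1.0, 1.0, 1.0, 1.0]])

scales = np.exp(log_scales)                    # log-scales -> linear scales
opacities = sigmoid(logits)                    # logits -> opacities in (0, 1)
unit_quats = quats / np.linalg.norm(quats, axis=1, keepdims=True)  # unit length
```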
- to_rgb(inplace=True)[source]¶
Convert sh0 from spherical harmonics (SH) format to RGB color format.
Converts SH DC coefficients to RGB colors in [0, 1] range. Formula: rgb = sh0 * SH_C0 + 0.5
- Parameters:
inplace (bool) – If True, modify this object in-place (default). If False, return new object.
- Return type:
- Returns:
GSData object (self if inplace=True, new object otherwise)
Example
>>> # Load PLY file (sh0 is in SH format)
>>> data = gsply.plyread("scene.ply")
>>> # Convert to RGB format in-place
>>> data.to_rgb()  # or: data.to_rgb(inplace=True)
>>> # Now sh0 contains RGB colors [0, 1]
>>> print(f"RGB color range: [{data.sh0.min():.3f}, {data.sh0.max():.3f}]")
>>>
>>> # Or create a copy if you need to keep SH format
>>> rgb_data = data.to_rgb(inplace=False)
- to_sh(inplace=True)[source]¶
Convert sh0 from RGB color format to spherical harmonics (SH) format.
Converts RGB colors in [0, 1] range to SH DC coefficients. Formula: sh0 = (rgb - 0.5) / SH_C0
- Parameters:
inplace (bool) – If True, modify this object in-place (default). If False, return new object.
- Return type:
- Returns:
GSData object (self if inplace=True, new object otherwise)
Example
>>> # Create GSData with RGB colors
>>> rgb_colors = np.random.rand(1000, 3).astype(np.float32)
>>> data = GSData(means=..., scales=..., sh0=rgb_colors, ...)
>>> # Convert to SH format in-place
>>> data.to_sh()  # or: data.to_sh(inplace=True)
>>> # Now sh0 contains SH DC coefficients
>>>
>>> # Or create a copy if you need to keep RGB format
>>> sh_data = data.to_sh(inplace=False)
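The two formulas, rgb = sh0 * SH_C0 + 0.5 and sh0 = (rgb - 0.5) / SH_C0, are exact inverses of each other. The sketch below assumes SH_C0 is the usual degree-0 spherical harmonics constant 1 / (2 * sqrt(pi)); the value gsply defines is not shown in this section. The helper names are hypothetical.

```python
import numpy as np

# Assumed value: the degree-0 real spherical harmonic, 1 / (2 * sqrt(pi)).
SH_C0 = 0.28209479177387814

def sh0_to_rgb(sh0):
    """SH DC coefficients -> RGB, following rgb = sh0 * SH_C0 + 0.5."""
    return sh0 * SH_C0 + 0.5

def rgb_to_sh0(rgb):
    """RGB -> SH DC coefficients, following sh0 = (rgb - 0.5) / SH_C0."""
    return (rgb - 0.5) / SH_C0

rgb = np.array([[0.0, 0.5, 1.0],
                [0.25, 0.75, 0.1]])

sh0 = rgb_to_sh0(rgb)
round_trip = sh0_to_rgb(sh0)
mid_grey = sh0_to_rgb(np.zeros(3))   # a zero coefficient maps to 0.5
```

Note that the formula itself does not clamp: a coefficient outside roughly ±1.77 maps to an RGB value outside [0, 1].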
- copy_slice(key)[source]¶
Efficiently slice and copy in one operation.
For slices that return views, this is more efficient than data[key].copy() as it avoids creating intermediate view objects.
For boolean masks and fancy indexing, this simply delegates to __getitem__ since those already return copies.
- Parameters:
key – Slice key (slice, int, array, or boolean mask)
- Return type:
- Returns:
A new GSData object with copied sliced data
Examples
data.copy_slice(slice(100, 200))  # Copy of elements 100-199 (avoids view)
data.copy_slice(slice(None, None, 10))  # Copy of every 10th element (avoids view)
data.copy_slice(mask)  # Same as data[mask] (already a copy)
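The view-versus-copy distinction that copy_slice() relies on is standard NumPy indexing behaviour, shown here on a plain array rather than on a GSData object.

```python
import numpy as np

means = np.arange(30, dtype=np.float32).reshape(10, 3)

view = means[2:5]             # basic slice: a view onto the same memory
copied = means[2:5].copy()    # explicit copy: independent memory

mask = means[:, 0] > 12       # boolean mask selecting the last five rows
masked = means[mask]          # boolean indexing: always a new array

view_shares = np.shares_memory(view, means)
copy_shares = np.shares_memory(copied, means)
masked_shares = np.shares_memory(masked, means)
```

Modifying view would change means, while modifying copied or masked would not.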
- get_gaussian(index)[source]¶
Get a single Gaussian as a GSData object.
Unlike direct indexing which returns a tuple for efficiency, this method returns a GSData object containing a single Gaussian.
- save(file_path, compressed=False)[source]¶
Save GSData to PLY file.
Convenience method that wraps plywrite() for object-oriented API.
- Parameters:
- Return type:
Example
>>> data = gsply.plyread("input.ply")
>>> data.save("output.ply")  # Uncompressed
>>> data.save("output.ply", compressed=True)  # Compressed
- classmethod load(file_path)[source]¶
Load GSData from PLY file.
Convenience classmethod that wraps plyread() for object-oriented API. Auto-detects compressed and uncompressed formats.
- Parameters:
- Return type:
- Returns:
GSData container with loaded data
Example
>>> data = GSData.load("scene.ply")  # Auto-detect format
>>> print(f"Loaded {len(data)} Gaussians")
- classmethod from_arrays(means, scales, quats, opacities, sh0, shN=None, format='auto', sh_degree=None, sh0_format=DataFormat.SH0_SH)[source]¶
Create GSData from individual arrays with format preset.
Convenient factory method for creating GSData from external arrays with automatic format detection or explicit format presets.
- Parameters:
means (ndarray) – (N, 3) array - Gaussian centers
scales (ndarray) – (N, 3) array - Scale parameters
quats (ndarray) – (N, 4) array - Rotation quaternions
opacities (ndarray) – (N,) array - Opacity values
sh0 (ndarray) – (N, 3) array - DC spherical harmonics
shN (ndarray|None) – (N, K, 3) array or None - Higher-order SH coefficients
format (str) – Format preset - "auto" (detect), "ply" (log/logit), "linear" or "rasterizer" (linear)
sh_degree (int|None) – SH degree (0-3) - auto-detected from shN if None
sh0_format (DataFormat) – Format for sh0 (SH0_SH or SH0_RGB), default SH0_SH
- Return type:
- Returns:
GSData object with specified format
Example
>>> # Auto-detect format from values
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0)
>>>
>>> # Explicit PLY format (log-scales, logit-opacities)
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0, format="ply")
>>>
>>> # Explicit linear format (for rasterizer)
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0, format="linear")
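One plausible way to auto-detect sh_degree from shN is sketched below. It assumes the common Gaussian Splatting convention that a degree-d model stores K = (d + 1)**2 - 1 higher-order coefficients per colour channel, with the DC term held separately in sh0. Whether gsply uses exactly this rule is an assumption, and sh_degree_from_k is a hypothetical helper.

```python
def sh_degree_from_k(k):
    """Infer SH degree d (0-3) from K = (d + 1)**2 - 1 higher-order coefficients."""
    for degree in range(4):
        if (degree + 1) ** 2 - 1 == k:
            return degree
    raise ValueError(f"K={k} does not match any SH degree in 0-3")

# K values for degrees 0 through 3 under this convention: 0, 3, 8, 15.
degrees = [sh_degree_from_k(k) for k in (0, 3, 8, 15)]
```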
- classmethod from_dict(data_dict, format='auto', sh_degree=None, sh0_format=DataFormat.SH0_SH)[source]¶
Create GSData from dictionary with format preset.
Convenient factory method for creating GSData from a dictionary with automatic format detection or explicit format presets.
- Parameters:
data_dict (dict) – Dictionary with keys: means, scales, quats, opacities, sh0, shN (optional)
format (str) – Format preset - "auto" (detect), "ply" (log/logit), "linear" or "rasterizer" (linear)
sh_degree (int|None) – SH degree (0-3) - auto-detected from shN if None
sh0_format (DataFormat) – Format for sh0 (SH0_SH or SH0_RGB), default SH0_SH
- Return type:
- Returns:
GSData object with specified format
Example
>>> # From dictionary with auto-detection
>>> data = GSData.from_dict({
...     "means": means, "scales": scales, "quats": quats,
...     "opacities": opacities, "sh0": sh0, "shN": shN
... })
>>>
>>> # Explicit PLY format
>>> data = GSData.from_dict(data_dict, format="ply")
>>>
>>> # Explicit linear format
>>> data = GSData.from_dict(data_dict, format="linear")