Data Strategy

Lakea uses different strategies based on dataset size and API constraints.

Tier   | Use case                                   | Implementation
Static | Small datasets, metadata, pre-aggregated   | JSON at build time
Direct | Large datasets, public CORS-enabled APIs   | Client-side queries
Proxy  | APIs with CORS issues or auth              | Cloudflare Worker
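
The tier table can be read as a small decision rule. The sketch below is one possible encoding, not Lakea's actual logic: the `DataSource` shape, the `chooseTier` name, and the `STATIC_ROW_LIMIT` cutoff are all hypothetical. It checks size first, on the assumption that a build-time fetch runs server-side and is therefore unaffected by CORS.

```typescript
type Tier = "static" | "direct" | "proxy";

// Hypothetical description of a data source; none of these fields come from Lakea.
interface DataSource {
  approxRows: number;    // rough dataset size
  corsEnabled: boolean;  // does the API send CORS headers?
  requiresAuth: boolean; // does the API need credentials?
}

// Assumed cutoff for what counts as a "small" dataset.
const STATIC_ROW_LIMIT = 10000;

function chooseTier(source: DataSource): Tier {
  // Small datasets are fetched at build time, where CORS does not apply.
  if (source.approxRows <= STATIC_ROW_LIMIT) return "static";
  // Large datasets behind CORS restrictions or auth go through the proxy.
  if (!source.corsEnabled || source.requiresAuth) return "proxy";
  // Everything else is queried from the client on demand.
  return "direct";
}
```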

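These public APIs need no authentication and no backend. The CORS-enabled ones are queried directly from the client; the one without CORS is fetched at build time instead: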

  • Protocol: TAP (ADQL via REST)
  • Auth: None required
  • CORS: No (requires server-side fetch)
  • Use case: Confirmed exoplanet data (~5,700 planets)
  • Tier: Static (fetched at build time via pnpm fetch:nasa)

  • Protocol: SQL queries via REST
  • Auth: None required
  • CORS: Yes
  • Use case: Galaxy photometry, spectra, images
  • Tier: Direct (query on demand)

  • Protocol: TAP (ADQL)
  • Auth: None required
  • CORS: Yes
  • Use case: Stellar positions, parallaxes, proper motions
  • Tier: Direct (async queries for large results, streaming for million+ rows)

  • Protocol: TAP (ADQL)
  • Auth: None required
  • CORS: Yes
  • Use case: Cross-matching with published catalogs
  • Tier: Direct
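
For the TAP-based services, a synchronous query is an HTTP request to the service's `sync` endpoint with `REQUEST`, `LANG`, and `QUERY` parameters. The sketch below builds such a URL. The `buildTapSyncUrl` name and the `example.org` base URL are placeholders, and the accepted `FORMAT` values differ between services, so `json` is an assumption to check against each archive.

```typescript
// Build a TAP synchronous query URL for an ADQL statement.
// baseUrl is the service's TAP root; the one used in the usage note is a placeholder.
function buildTapSyncUrl(baseUrl: string, adql: string, format: string = "json"): string {
  const params: Array<[string, string]> = [
    ["REQUEST", "doQuery"],
    ["LANG", "ADQL"],
    ["FORMAT", format], // assumed value; supported formats vary by service
    ["QUERY", adql],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  // Drop trailing slashes so the path does not end up as ".../tap//sync".
  return `${baseUrl.replace(/\/+$/, "")}/sync?${query}`;
}
```

Calling `buildTapSyncUrl("https://example.org/tap", "SELECT TOP 5 ra, dec FROM t")` yields a URL that can be passed to `fetch` on CORS-enabled services.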

Load data based on viewport. As users pan or zoom, fetch only the visible region.

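interface ViewportQuery {
  // Right ascension bounds of the visible region
  ra: { min: number; max: number };
  // Declination bounds of the visible region
  dec: { min: number; max: number };
  // Optional magnitude cutoff for returned objects
  magnitudeLimit?: number;
}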
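
As an illustration, a `ViewportQuery` could be translated into an ADQL statement along these lines. This is a sketch under assumptions: the `viewportToAdql` name, the default table and magnitude column (Gaia DR3 naming), and the row cap are illustrative choices, and treating `ra.min > ra.max` as a viewport that wraps across RA 0°/360° is a convention invented for this example.

```typescript
// Repeated here so the sketch is self-contained.
interface ViewportQuery {
  ra: { min: number; max: number };
  dec: { min: number; max: number };
  magnitudeLimit?: number;
}

// Translate a viewport into an ADQL query string.
// Table, column names, and row cap are assumptions, not Lakea's configuration.
function viewportToAdql(
  view: ViewportQuery,
  table: string = "gaiadr3.gaia_source",
  magColumn: string = "phot_g_mean_mag",
  maxRows: number = 5000,
): string {
  // A viewport crossing RA 0/360 is expressed as min > max and needs an OR.
  const raClause =
    view.ra.min <= view.ra.max
      ? `ra BETWEEN ${view.ra.min} AND ${view.ra.max}`
      : `(ra >= ${view.ra.min} OR ra <= ${view.ra.max})`;
  const clauses = [raClause, `dec BETWEEN ${view.dec.min} AND ${view.dec.max}`];
  // Smaller magnitudes are brighter, so the limit keeps objects at or below it.
  if (view.magnitudeLimit !== undefined) {
    clauses.push(`${magColumn} <= ${view.magnitudeLimit}`);
  }
  return `SELECT TOP ${maxRows} ra, dec, ${magColumn} FROM ${table} WHERE ${clauses.join(" AND ")}`;
}
```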

Start with aggregates and load details as the user zooms in:

  1. Overview: Pre-aggregated density maps
  2. Region: Summary statistics per region
  3. Detail: Individual objects on high zoom
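
The three levels above could be selected from the current field of view. The sketch below is a minimal version of that idea. The thresholds and the `detailLevelFor` name are hypothetical and would need tuning against real data density.

```typescript
type DetailLevel = "overview" | "region" | "detail";

// Hypothetical thresholds, expressed as the visible field width in degrees.
const REGION_MAX_FOV_DEG = 30;
const DETAIL_MAX_FOV_DEG = 2;

// Pick which data product to load for the current zoom.
function detailLevelFor(fovDeg: number): DetailLevel {
  if (fovDeg <= DETAIL_MAX_FOV_DEG) return "detail"; // individual objects
  if (fovDeg <= REGION_MAX_FOV_DEG) return "region"; // per-region summaries
  return "overview";                                 // pre-aggregated density map
}
```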

Offload heavy transforms from the main thread (for example, to a Web Worker) to prevent UI blocking:

  • Coordinate conversions
  • Statistical calculations
  • Data filtering and sorting
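
As one example of a transform worth moving off the main thread, the sketch below converts RA/Dec pairs in degrees to unit vectors on the celestial sphere. Running it in a Web Worker is an assumption about the setup, and the `toUnitVectors` name and the worker file name are hypothetical. Typed arrays are used because their buffers can be transferred between threads without copying.

```typescript
// Convert RA/Dec (degrees) to packed unit vectors [x0, y0, z0, x1, y1, z1, ...].
function toUnitVectors(raDeg: Float64Array, decDeg: Float64Array): Float64Array {
  const n = Math.min(raDeg.length, decDeg.length);
  const out = new Float64Array(n * 3);
  const DEG_TO_RAD = Math.PI / 180;
  for (let i = 0; i < n; i++) {
    const ra = raDeg[i] * DEG_TO_RAD;
    const dec = decDeg[i] * DEG_TO_RAD;
    const cosDec = Math.cos(dec);
    out[3 * i] = cosDec * Math.cos(ra);
    out[3 * i + 1] = cosDec * Math.sin(ra);
    out[3 * i + 2] = Math.sin(dec);
  }
  return out;
}

// Inside a hypothetical worker file (e.g. transform.worker.ts), the wiring
// could look like this:
//
//   self.onmessage = (event: MessageEvent<{ ra: Float64Array; dec: Float64Array }>) => {
//     const xyz = toUnitVectors(event.data.ra, event.data.dec);
//     // Transfer the underlying buffer instead of copying it.
//     self.postMessage(xyz, [xyz.buffer]);
//   };
```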

For million+ row results (e.g., Gaia queries):

  1. Use async TAP queries
  2. Stream results in chunks
  3. Process and render incrementally
  4. Allow cancellation
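
Steps 2 to 4 can be sketched as an async generator that turns text chunks into rows as they arrive and stops when cancelled. This is an illustration under assumptions: the names `Cancellable`, `splitLines`, `parseRow`, and `streamRows` are hypothetical, the input is assumed to be CSV text, and the parser is deliberately naive (no quoted fields). An `AbortController`'s `signal` satisfies `Cancellable`.

```typescript
// Anything with an `aborted` flag, such as an AbortSignal.
interface Cancellable {
  readonly aborted: boolean;
}

// Split buffered text into complete lines, keeping the trailing partial line.
function splitLines(carry: string, chunk: string): { lines: string[]; carry: string } {
  const parts = (carry + chunk).split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts, carry: rest };
}

// Naive CSV row parser: strips a trailing \r and splits on commas.
function parseRow(line: string): string[] {
  return line.replace(/\r$/, "").split(",");
}

// Yield rows incrementally from a stream of text chunks.
async function* streamRows(
  chunks: AsyncIterable<string>,
  signal?: Cancellable,
): AsyncGenerator<string[]> {
  let carry = "";
  for await (const chunk of chunks) {
    if (signal?.aborted) return; // stop consuming once cancelled
    const result = splitLines(carry, chunk);
    carry = result.carry;
    for (const line of result.lines) {
      if (line.length > 0) yield parseRow(line);
    }
  }
  // Emit the last row if the stream did not end with a newline.
  if (carry.length > 0 && !signal?.aborted) yield parseRow(carry);
}
```

A consumer would iterate with `for await (const row of streamRows(source, controller.signal))` and render each batch as it arrives. In a browser, the chunk source could be a `fetch` response body read through `getReader()` and decoded with `TextDecoder`.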